TSPS-142 updates to help creating simulated reference panel and running imputation against it #1296
Can one of the admins verify this patch?
some thoughts on specifying the ref panel path / file names
@@ -40,12 +42,12 @@ workflow ImputationBeagle {

  scatter (contig_index in range(length(contigs))) {
    # these are specific to hg38 - contig is format 'chr1'
    String reference_filename = reference_panel_path + "hgdp.tgp.gwaspy.merged." + contigs[contig_index] + ".merged.AN_added.bcf.ac2"
    String reference_basename = reference_panel_path + "sim.10k." + contigs[contig_index]
i think we should either pass in the basename as an input or add it to the reference_panel_path input so this can just be
String reference_basename = reference_panel_path + contigs[contig_index]
alternately we could have the `reference_panel_path` be the full path with a regex string to substitute the contig using [sub](https://github.com/openwdl/wdl/blob/main/versions/1.0/SPEC.md#string-substring-string-string). so the `reference_panel_path` input would have to look like
https://lz8b0d07a4d28c13150a1a12.blob.core.windows.net/sc-94fd136b-4231-4e80-ab0c-76d8a2811066/hg38/simulated/sim.10k.<CONTIG>
and then this line would become `String reference_basename = sub(reference_panel_path, "<CONTIG>", contigs[contig_index])`
i'm not sure which would be less confusing. using the `sub` solution would allow the existing ref panel paths to not be renamed.
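To compare the two suggestions side by side, here is a minimal Python sketch (not WDL) of what each would produce. The function names and the `panel/` prefix are hypothetical, made up for illustration. Python's `str.replace` stands in for WDL's `sub`; `sub` treats its pattern as a regular expression, but a literal `<CONTIG>` placeholder contains no regex metacharacters, so the two should behave the same here.

```python
# Sketch only: function names and example paths are hypothetical, not part of the WDL.

def basename_by_concat(reference_panel_path: str, contig: str) -> str:
    """Option 1: the input already ends with the file prefix, so append the contig."""
    return reference_panel_path + contig


def basename_by_sub(reference_panel_path: str, contig: str,
                    placeholder: str = "<CONTIG>") -> str:
    """Option 2: the input is a full template; swap the placeholder for the contig."""
    if placeholder not in reference_panel_path:
        raise ValueError(f"expected {placeholder!r} in {reference_panel_path!r}")
    return reference_panel_path.replace(placeholder, contig)


if __name__ == "__main__":
    contigs = ["chr1", "chr2"]
    for contig in contigs:
        # Both options yield the same basename for the simulated panel.
        print(basename_by_concat("panel/sim.10k.", contig))
        print(basename_by_sub("panel/sim.10k.<CONTIG>", contig))
        # Only the placeholder form can put the contig in the middle of a name,
        # which is what lets existing panel files keep their current names.
        print(basename_by_sub(
            "panel/hgdp.tgp.gwaspy.merged.<CONTIG>.merged.AN_added.bcf.ac2", contig))
```

The concatenation form is simpler to explain to users, while the placeholder form tolerates file names where the contig is not the final component.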
Merged commit b404f9b into TSPS-183_mma_beagle_imputation_hg38
…ng imputation against it (#1296)
* add optional error count override for testing
* rename reference base prefix variable and make it more user friendly
---------
Co-authored-by: Jose Soto <[email protected]>
* wip add beagle imputation stuff
* add 2 wdls to dockstore.yml
* fix docker gar url
* use the right path for jars
* wip on imputation wdl
* oops use correct jar
* missing equals
* fix java call again
* fix java call
* oops match file names
* update beagle jar to 01Mar24.d36
* debug GatherVcfs
* debug GatherVcfs 2
* try to resolve missing file issue
* don't impute over padding
* make the index again
* supply vcf_index input to SelectVariantsByIds
* update Imputation wdl too
* newlines
* update for hg38
* Revert "update for hg38" This reverts commit 3757137.
* update for hg38
* liftover wdl
* remove GCP-specific vm commands
* use gatk
* fix suffix and basename
* fix more filenames
* remove missing contig stuff for now
* fix ref panel path
* another chr fix
* warn on missign contig
* do fail if missing contig
* more mem
* troubleshooting wld
* fixed plink path
* add select_first test
* cleanup
* add if block to test
* create and use ref panel interval list
* move interval list creation to ref panel wdl
* give default values for optional inputs, weird
* change CountVariants calls
* test
* add output to test
* next test
* more test
* another test
* update real task
* TSPS-226 presplit and prechunk beagle inputs (#1272)
* pre splitting and prechunking beagle imputation inputs to lower log numbers and storage account egress
---------
Co-authored-by: Jose Soto <[email protected]>
* TSPS-221 remove index input and add seed to make beagle tool deterministic (#1285)
* remove multi sample vcf index workflow input and add it to the PreSplitVcf task. add seed number so that beagle is always deterministic. add comment to cpu input for PhaseAndImputeBeagle task
* change output_callset_name to output_base_name and remove optional outputs
* change n_failed_chunks ticket to an int
---------
Co-authored-by: Jose Soto <[email protected]>
* rename workflow
* TSPS-241 Clean up beagle wdl (#1288)
* clean up wdl with stuff from TSPS-241
* try to make fail fast work with double nested scatters
---------
Co-authored-by: Jose Soto <[email protected]>
* add specific gatk_docker
* TSPS-142 updates to help creating simulated reference panel and running imputation against it (#1296)
* add optional error count override for testing
* rename reference base prefix variable and make it more user friendly
---------
Co-authored-by: Jose Soto <[email protected]>
* add maxRetries 2 to all imputation beagle tasks
* add prechunk wdl to dockstore
* use acr for default ubuntu image
* add preemptible 3
* use acr gatk docker as default
* don't use preemptibles on GatherVcfs
* basename fix for imputation beagle ref panel generation (#1332)
* try auto specifying chr at end of basename
* both tasks
* add liftovervcfs to dockstore
* allow specifying max mem
* TSPS-269 Speed up CountVariantsInChunksBeagle by using bedtools (#1335)
* try creating bed files
* try again
* try again again
* a different thing
* use bedtools and bed ref panel files
* oops update the correct task
* fix
* use the right freaking file name
* remove comment
* update pipeline version to 0.0.2
* TSPS-293: Fix up streaming imputation beagle (#1347) update ImputationBeagle
* add array imputation quota consumed wdl (#1425)
* add array imputation quota consumed wdl
* add changelogs for imputation array related workflows
---------
Co-authored-by: Jose Soto <[email protected]>
* TSPS-239 get wdl running on 400k sample ref panel (#1373)
* changes to help beagle imputation wdl run on a 400k sample reference panel
---------
Co-authored-by: Jose Soto <[email protected]>
* remove create imputation ref panel beagle wdl and changelog
* PR feedback
---------
Co-authored-by: Jose Soto <[email protected]>
Co-authored-by: M. Morgan Aster <[email protected]>
* add set -e -o pipefail to all relevant imputation tasks (#1434)
Co-authored-by: Jose Soto <[email protected]>
* TSPS-341 remove tasks for recovering variants not in the reference panel (#1468)
* remove tasks for recovering variants not in the reference panel and separate out beagle tasks from imputation tasks
* remove prechunk wdl and references to it remove "Beagle" from task names in BeagleTasks.wdl
---------
Co-authored-by: Jose Soto <[email protected]>
* Updated pipeline_versions.txt with all pipeline version information
* [PR to feature branch] Add testing to imputation beagle (#1503)
* TSPS-239 get wdl running on 400k sample ref panel (#1373)
* changes to help beagle imputation wdl run on a 400k sample reference panel
---------
Co-authored-by: Jose Soto <[email protected]>
* remove create imputation ref panel beagle wdl and changelog
* PR feedback
---------
Co-authored-by: Jose Soto <[email protected]>
Co-authored-by: M. Morgan Aster <[email protected]>
* add new files for testing
* add test wdl to .dockstore.yml
* add test data json files, other updates
* version to 1.0.0, update changelog
* update beagle docker
* update beagle docker again
* fix call phase task
* re-deleting ImputationBeaglePreChunk.wdl
* temporarily try to run test on feature branch pr
* remove vault inputs
* update output basename for plumbing test
* remove feature branch from gha pr branches
* pr comments
* add quotes in VerifyTasks.CompareVcfs
* update dockers, move CreateVcfIndex to BeagleTasks
---------
Co-authored-by: jsotobroad <[email protected]>
Co-authored-by: Jose Soto <[email protected]>
* Updated pipeline_versions.txt with all pipeline version information
* remove newline at end of Utilities.wdl
* remove LiftoverVcfs, add README for imputation_beagle
* oops this commit adds the README for imputation_beagle
* rename test inputs files to reflect contents
* PR comments round 1
* Updated pipeline_versions.txt with all pipeline version information
* update changelog for BroadInternalImputation
* Updated pipeline_versions.txt with all pipeline version information
* add back newline to Utilities.wdl with -w flag on changed file check
* remove change to Minimac4 task
* revert change to tool command in OptionalQCSites
* fix fail task dependency, revert attempt to ignore newline in diff, other pr comments
* update README for ImputationBeagle
* rename test files
* Updated pipeline_versions.txt with all pipeline version information
* another commit for hashes
* Updated pipeline_versions.txt with all pipeline version information
* dummy commit
* pr comments
* Updated pipeline_versions.txt with all pipeline version information
* dummy commit
* dummy commit
---------
Co-authored-by: jsotobroad <[email protected]>
Co-authored-by: Jose Soto <[email protected]>
Co-authored-by: GitHub Action <[email protected]>
Co-authored-by: Nikelle Petrillo <[email protected]>
Co-authored-by: npetrill <[email protected]>
Description
Don't think we actually want to merge this yet, but this is the wdl i ran to test the simulated data through the beagle pipelines. we should chat about how we want to pass the reference panel basename along with the actual reference panel
here is a run across all chromosomes using the 10k simulated reference panel. Note that an error override of 0 was used.
Checklist
If you can answer "yes" to the following items, please add a checkmark next to the appropriate checklist item(s) and notify our WARP documentation team by tagging either @ekiernan or @kayleemathews in a comment on this PR.